Learning Methods to Combine Linguistic Indicators: Improving Aspectual Classification and Revealing Linguistic Insights

نویسندگان

  • Eric V. Siegel
  • Kathleen McKeown
چکیده

Aspectual classification maps verbs to a small set of primitive categories in order to reason about time. This classification is necessary for interpreting temporal modifiers and assessing temporal relationships, and is therefore a required component for many natural language applications. A verb's aspectual category can be predicted by co-occurrence frequencies between the verb and certain linguistic modifiers. These frequency measures, called linguistic indicators, are chosen by linguistic insights. However, linguistic indicators used in isolation are predictively incomplete, and are therefore insufficient when used individually. In this article, we compare three supervised machine learning methods for combining multiple linguistic indicators for aspectual classification: decision trees, genetic programming, and logistic regression. A set of 14 indicators are combined for classification according to two aspectual distinctions. This approach improves the classification performance for both distinctions, as evaluated over unrestricted sets of verbs occurring across two corpora. This demonstrates the effectiveness of the linguistic indicators and provides a much-needed full-scale method for automatic aspectual classification. Moreover, the models resulting from learning reveal several linguistic insights that are relevant to aspectual classification. We also compare supervised learning methods with an unsupervised method for this task.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Corpus-Based Linguistic Indicators for Aspectual Classification

Fourteen indicators that measure the frequency of lexico-syntactic phenomena linguistically related to aspectual class are applied to aspectual classification. This group of indicators is shown to improve classification performance for two aspectual distinctions, stativity and completedness (i.e., telicity), over unrestricted sets of verbs from two corpora. Several of these indicators have not ...

متن کامل

Incorporating Cognitive Linguistic Insights into Classrooms: the Case of Iranian Learners’ Acquisition of If-Clauses

Cognitive linguistics gives the most inclusive, consistent description of how language is organized, used and learned to date. Cognitive linguistics contains a great number of concepts that are useful to second language learners.  If-clauses in English, on the other hand, remain intriguing for foreign language learners to struggle with, due to their intrinsic intricacies. EFL grammar books are ...

متن کامل

Learning Methods for Combining Linguistic Indicators to Classify Verbs

Fourteen linguistically-motivated numerical indicators are evaluated for their ability to categorize verbs as either states or events. The values for each indicator are computed automatically across a corpus of text. To improve classification performance, machine learning techniques are employed to combine multiple indicators. Three machine learning methods are compared for this task: decision ...

متن کامل

Applied Linguistic Approach to Language Learning Strategies (A Critical Review)

From applied linguistic point of view, the fundamental question facing the language teachers, methodologists and course designers is which procedure is more effective in FL/SL: learning to use or using to learn? Definitely, in order to be a competent language user, knowledge of language system is necessary, but it is not sufficient to be a successful language user. That is why there was a gradu...

متن کامل

Classification of telicity using cross-linguistic annotation projection

This paper addresses the automatic recognition of telicity, an aspectual notion. A telic event includes a natural endpoint (she walked home), while an atelic event does not (she walked around). Recognizing this difference is a prerequisite for temporal natural language understanding. In English, this classification task is difficult, as telicity is a covert linguistic category. In contrast, in ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computational Linguistics

دوره 26  شماره 

صفحات  -

تاریخ انتشار 2000